home *** CD-ROM | disk | FTP | other *** search
- A Brief Retrospective on the Sprite Network Operating System
- ------------------------------------------------------------
-
- John Ousterhout
- Computer Science Division
- Department of EECS
- University of California at Berkeley
-
- Sprite is a research operating system whose development I have had the
- pleasure of leading over the last eight years. Four graduate students
- and I started the project in the Fall of 1984 because we felt that
- current operating systems were paying insufficient attention to local-area
- networks. It seemed to us that networking support had been added in a
- quick and dirty fashion to systems that were designed to run stand-alone.
- As a result, networked workstations didn't work well together. At the
- time we started Sprite there were no good network file systems (even NFS
- didn't exist yet) and administering a network of workstations was a
- nightmare.
-
- Our goal for Sprite was to "do networking right" by building a new operating
- system kernel from scratch and designing the network support into the system
- from the start. We hoped to create a system where a collection of
- networked workstations would behave like a single system with both storage
- and processing power shared uniformly among the workstations. We hoped
- that users would be able to tap the power of the entire network while
- preserving the simple behavior and administrative ease of a traditional
- time-shared system.
-
- I think that we achieved these goals. Four technical accomplishments
- stand out in my mind. First and foremost is Sprite's network file system,
- which demonstrated that network file systems can provide a convenient
- user model without sacrificing performance. Sprite's file system allows
- file sharing while completely hiding the network. It provides the same
- behavior among workstations that users would see if they all ran on a
- traditional time-sharing system. Even I/O devices can be accessed
- unformly across the network, and user processes can extend the file
- system by implementing its I/O and naming protocols using pseudo devices
- and pseudo file systems. At the same time, Sprite uses aggressive file
- caching to achieve high performance. Five years after its construction,
- Sprite still has the fastest network file system in existence.
-
- Sprite's second key accomplishment is its process migration mechanism,
- which allows processes to be moved transparently between workstations
- at any time. With process migration a single user can harness the power
- of many workstations simultaneously, achieving speedups of four or more
- on common system tasks such as recompilation. The migration mechanism
- keeps track of idle machines, uses them for migration, and evicts
- migrated processes when a workstation's owner returns, so that migration
- doesn't impact the response time of active users. Evicted processes are
- remigrated to a different idle machine or executed on their home machine.
- Sprite is one of only a few systems where process migration has been
- used on a day-to-day basis by a large user community.
-
- Sprite's third key accomplishment is its single system image. The file
- system and process migration provide the most obvious evidence of the
- single system image, since they make storage and processing power sharable
- among workstations. But in many other ways Sprite looks and feels just like
- a single system. There is one root partition, one password file, one swap
- area (in the network file system), one login database, and so on. The
- "finger" command tells about all users on all workstations in the Sprite
- cluster, not just those on the workstation where the command was invoked.
- System administration is no harder with fifty machines in the network than
- it is with ten, and adding a new machine is no more difficult than adding
- a new user account.
-
- Sprite's single system image also supports different machine architectures
- in the same cluster. We developed a framework for separating architecture-
- independent information from architecture-specific information. All the
- information for all architectures is visible at all times, which simplifies
- cross-development, yet each machine uses the appropriate architecture-
- dependent information when it is needed.
-
- Sprite's fourth key accomplishment is its log-structured file system (LFS),
- which demonstrates a radical new approach to file system design. LFS
- treats the disk more like a tape, writing information sequentially in large
- runs that permit great efficiency. We developed a new garbage collection
- mechanism that continually opens up large extents of free space on the
- disk. The result is a system that writes small files to disk an order of
- magnitude faster than any other existing file system. At the same time
- it handles other operations, such as reads and large writes, at least
- as well as other systems. Log-structured file systems have many other
- advantages as well, such as fast crash recovery, the ability to store
- information on disk in compressed form, and the ability to vary the block
- size from file to file.
-
- In addition to the above results, which have already been achieved, there
- are several promising projects still underway, such as recovery enhancements,
- a new file system called Zebra that stripes files across multiple servers,
- and high-bandwidth file service using disk arrays. Before the Sprite project
- ends I expect to see major results from each of these projects too.
-
- Throughout the Sprite project we have tried to characterize the behavior
- of the system and to use this information to guide future developments.
- Some of our most important results are the measurements we have made. The
- founding of the project was based in part on file system measurements made
- on time-shared systems in 1984 and 1985. We made additional measurements
- of Sprite usage in 1991 to see how patterns had changed and to analyze
- the potential application of non-volatile memory in networked systems.
-
- Perhaps the most significant accomplishment of all is that we were able
- to make the system work, not just for ourselves but for a community of
- users that numbered as high as 80 or more at the peak of the project.
- Many of these users depended on Sprite for all of their day-to-day
- computing needs, such as mail and printing. For a period of several
- years it was common to see 25-35 simultaneous logins of which only a
- half-dozen were Sprite developers. I know of only one other university
- project that developed a new operating system kernel from scratch and
- used it to support a user community this large for this long; that project
- was Multics, which was carried out at MIT in the late 1960's.
-
- Furthermore, we built Sprite (more than 200,000 lines of new code in all)
- with a small team that averaged only about four graduate students and one
- or two staff or undergraduate assistants. We never got too large to have
- project meetings in my office, although there were times when we had to
- borrow two additional chairs to supplement the six already in my office.
-
- The history of Sprite divides into roughly three phases: initial
- development, consolidation and LFS development, and new ventures
- and closeout.
-
- The first phase of Sprite, initial development, lasted from the founding
- of the project in the fall of 1984 until about the end of 1987. We began
- coding on Sun-2 workstations in early 1985 and had a system that could
- execute shell commands by the spring of 1986. In the summer of 1986 we
- started developing the "real" Sprite file system (we'd used an older
- network file system called BNFS up until that point). About that time
- we also started on process migration and porting the X window system.
- By the fall of 1987 all of these things were working, along with an
- internet protocol sever. We had also ported Sprite to Sun-3's. At this
- point we copied the kernel sources over to Sprite and began doing all of
- our kernel development on Sprite itself.
-
- The second phase in Sprite's history lasted from late 1987 to late 1990.
- This phase consisted mostly of consolidation. In early 1988 we made a
- major revision of the file system. Remote device support was improved,
- the pseudo device implementation was rewritten, and a simple recovery
- protocol was introduced so the system could recover gracefully from
- server crashes. Process migration underwent major improvements also,
- and by late 1988 it became stable enough for us to use it daily in system
- development. In 1988 we ported Sprite to the SPUR research multiprocessor
- (the SPUR project provided much of the early funding for Sprite), and
- in 1989 we ported it to DECstation-3100 and Sun-4 platforms. A port to
- the Sequent Symmetry machine was carred out at Sequent in late 1989 and
- early 1990.
-
- In late 1988 we began to support users other than Sprite developers. The
- user community gradually grew in size, peaking at around 80 in 1990 and
- 1991. We also prepared a distribution tape so that we could make Sprite
- available to people outside Berkeley. The first tapes were sent out in
- late 1989; over the life of the project Sprite has run at about ten
- different sites.
-
- The most significant new development during Sprite's second phase was
- the LFS implementation. We made preliminary designs and studies in 1988
- but didn't solidify the prototype design until 1989 (as part of the
- newly started RAID project). Coding started in late 1989. By the spring of
- 1990 LFS was showing signs of life, and it entered production use in the
- fall of 1990. By late 1991 virtually all of Sprite's dozen disks were
- using LFS.
-
- The final phase of the project started in late 1990 and will continue
- until Sprite shuts down in late 1993 or early 1994. In this phase we
- initiated several new projects, most of which reflected the increasing
- focus of the project on issues related to storage management. In the
- winter of 1990 we began to analyze the behavior of recovery after file
- server crashes; this led to a series of experiments with better recovery
- techniques, such as server-driven recovery and the use of non-volatile
- storage. In 1991 we began a project to see if Sprite could be
- re-implemented as a user-level server process running under the Mach 3.0
- kernel; this project completed in the summer of 1992 with substantial
- functionality but disappointing performance. In 1991 and 1992 we also
- developed the Jaquith tape library system, which made robotically-controlled
- tape systems available for both Sprite and other UNIX systems. During the
- same period we started projects to experiment with striping files across
- multiple file servers (Zebra) and to improve the bandwidth of remote
- accesses to disk arrays. Both of these projects are still underway.
-
- Like most software, the Sprite kernel became harder and harder to maintain
- as it aged. Frequent revisions and changes in project personnel led to
- increases in system complexity, in spite of our best efforts to keep
- things clean and simple. In addition, we found it harder and harder to
- keep up with developments in commercial operating systems. By 1990
- there were several commercial versions of UNIX with massive support teams,
- such as System V, Solaris, and OSF. These systems were adding features
- at a rapid pace and our users wanted access to these features under
- Sprite. We added new features such as shared libraries and binary
- compatibility with SunOS and Ultrix, but we found ourselves spending
- more and more time on tasks that were not research oriented.
-
- At the end of 1991 we decided to bring the Sprite project to a gradual
- close. Since then we have not started any major new developments and
- no new graduate students have joined the project. We plan to complete
- the projects that are currently underway and then shut down the system
- in late 1993 or early 1994. We no longer encourage new users to work on
- Sprite, and our user community is slowly shrinking back to just the Sprite
- team. Sprite has served us long and well as a research vehicle; now it is
- time to move on to other things.
-
- Many people have contributed to Sprite over the years. I can't possibly
- hope to list every significant contribution, but I'll try anyway. The
- list below summarizes the work of the principal project members (my
- research students and the staff who reported directly to me). The people
- are listed in chronological order by the date when they started working
- on Sprite-related things, and the projects are listed with the most
- important ones (in my opinion) first.
-
- Brent Welch (Summer 1984 - Spring 1990):
- Remote procedure calls; file system (storage manager, file system
- switches, prefixes, name lookup, remote devices, crash recovery
- protocol, disk formatting, dump and restore, migration support);
- pseudo devices; pseudo file systems; BNFS file system; NFS support;
- device drivers; kernel profiling; bootstrapping.
-
- Andrew Cherenson (Fall 1984 - Fall 1987):
- Internet protocol sever; window system design and X10 porting;
- timer; serial interfaces; user-level debugging; pseudo-devices;
- init process; command porting (network commands, login); process-related
- system calls; UNIX compatibility; manual entry formatting; C library.
-
- Fred Douglis (Fall 1984 - Fall 1990):
- Process migration and the pmake program; UNIX compatibility and
- command porting; system call interfaces; synchronization, scheduling,
- and process support; porting and support for major programs such
- as emacs and tex; trap handling and UART support for SPUR; early
- design work for log-structured file systems; experiments with
- optical disks.
-
- Mike Nelson (Fall 1984 - Fall 1988):
- Virtual memory; file system (caching, disk checker, vm interactions,
- migration support, crash recovery, select); kernel debugging;
- Sun-3 and DECstation-3100 ports; device and network drivers;
- signal handling; SPUR port (virtual memory, trap handlers, etc.);
- command porting.
-
- John Ousterhout (Fall 1984 - ??):
- Memory allocation; C library; terminal driver; context switching;
- gcc porting, mkmf program.
-
- Adam de Boor (Fall 1986 - Summer 1988):
- Pmake program; X11 porting (e.g. device drivers and region code);
- xman and mkmf programs; swat debugger.
-
- Mendel Rosenblum (Winter 1988 - Fall 1992):
- Log-structured file system; SPUR port; debugging tools; disk drivers;
- Sun-4 port; X11R4 porting.
-
- Mary Baker (Winter 1988 - ??):
- Recovery analysis and redesign; congestion control for remote procedure
- calls; recovery box; Sun-4, SPARCstation, SPARCstation-2 ports; file
- system measurements; analysis of NVRAM uses; RPC byte-swapping; SCSI
- device driver; multi-processor conversion.
-
- Bob Bruce (Fall 1988 - Fall 1990, Fall 1991 - Winter 1992):
- Spooler daemons; dump and restore utilities; Sprite distribution tape;
- support for compilation and debugging tools; user-level profiling;
- floating-point support; DECstation-3100 and Sequent ports; UNIX
- compatibility; Operation Desert Storm support; X11R5 port.
-
- John Hartman (Fall 1988 - ??):
- Zebra striped file system; file system measurements; port to SPUR
- multiprocessor; measurements of Sprite running on SPUR multiprocessor;
- device and network drivers (FDDI and Ultranet); synchronization
- analysis; disk checker, dumps, and other disk utilities; multiprocessor
- support; bootstrapping; debugger support; LFS support; scheduler.
-
- Don Reeves (Spring 1989):
- ARP and reverse ARP.
-
- Ken Shirriff (Summer 1989 - ??):
- High-speed file transfer using RAID; file system traces; analysis
- of name caching; shared virtual memory; mapped files; System-V
- synchronization; security enforcement; mail support; UNIX
- compatibility; dynamic linking; network daemons; DECstation-3100
- command porting; dump/restore utilities.
-
- Mike Kupfer (Summer 1990 - Summer 1992):
- Sprite as Mach server process; measurements for Sprite analysis paper;
- internal error checks in kernel; support for pmake, X, dump/restore
- utilities, and other administrative tools.
-
- Jim Mott-Smith (Winter 1991 - Fall 1992):
- Jaquith tape archive system; SCSI drivers; NFS support; internet
- protocol server support; dump/restore utilities; sendmail support.
-
- Geoff Voelker (Fall 1991 - Summer 1992):
- Disk utilities; FDDI network driver; network utilities.
-
- Matt Secor (Summer 1992):
- Debugger support.
-
- In addition to the people listed above, there were many others who made
- significant contributions to Sprite even though they didn't report
- directly to me. Here are a few of the "outside helpers"; apologies to
- anyone that I've overlooked.
-
- Bob Beck (Sequent port), Ann Chervenak (device drivers), Doug Johnson
- (SPUR debugging), Ed Lee (RAID striping driver), Dean Long (kernel bug
- fixing, bootstrapping, SPARCstation port), Ethan Miller (RAID controller
- support) , Srinivasan Seshan (Ultranet support), Thorsten Von Eicken
- (X11R4 port), Jay Vosburgh (Sequent port).
-